perm filename MQ[4,KMC]1 blob sn#006503 filedate 1972-10-25 generic text, type T, neo UTF8
00100	 CAN EXPERT JUDGES DISTINGUISH PARANOID PATIENTS FROM A
00200	COMPUTER MODEL OF PARANOID THOUGHT USING TRANSCRIPTS OF TELETYPED PSYCHIATRIC
00300	INTERVIEWS?                  
00310	              COLBY AND HILF
00400	
00500	
00600	MOST   PSYCHIATRISTS   DO  NOT  READ  THE  LITERATURE  IN  ARTIFICIAL
00700	INTELLIGENCE, NOR IS THERE  ANY  NEED  TO.  NEVERTHELESS, IN  1971  THERE
00800	APPEARED  IN  THAT  LITERATURE  A DESCRIPTION OF A CASE OF ARTIFICIAL
00900	PARANOIA [ ].  THE CASE CONSISTED OF A  COMPUTER  SIMULATION  OF  THE
01000	PARANOID MODE OF INTERACTION IN A PSYCHIATRIC INTERVIEW.
01100	
01200	TO  SIMULATE A PROCESS ON  A  COMPUTER IS TO CONSTRUCT  AN ALGORITHM OR  
01300	COMPUTER PROGRAM WHICH CAN REPRODUCE THE PHENOMENA CHARACTERISTIC
01400	OF  THE  PROCESS  UNDER  CONSIDERATION.  A  SIMULATION  IS SAID TO BE
01500	SUCCESSFUL WHEN ITS BEHAVIOR IN SOME CONTEXT IS INDISTINGUISHABLE  BY
01600	EMPIRICAL  TESTS  FROM  THE PROCESS IT IS SIMULATING. THE REPRODUCTION
01700	ACHIEVED BY THE ALGORITHM IS NOT ONE OF SIMPLE MIMICRY. IT IS ACHIEVED BY
01800	POSTULATING A STRUCTURE OF UNDERLYING MECHANISMS CAPABLE OF GENERATING THE BEHAVIOR IN
01900	QUESTION. THE UNDERLYING STRUCTURE THEN CONSTITUTES AN EXPLANATION OF THE OBSERVED PHENOMENA.
02000	
02100	ONE METHOD FOR TESTING WHETHER A  SIMULATION  HAS  ACHIEVED  A  SUCCESSFUL
02200	REPRODUCTION CONSISTS OF USING  INDISTINGUISHABILITY  TESTS. SOME FAMILIAR
02300	EXAMPLES FROM OTHER DISCIPLINES INVOLVE DISTINGUISHING  NATURAL  FROM
02400	SYNTHETIC  INSULIN,  NATURAL  FROM  ARTIFICIAL  FIBERS, AND NATURAL FROM
02500	SYNTHETIC  DIAMONDS. FOR INSTANCE,  WHEN  SYNTHETIC  HORMONES   SHOW   SOME   (NOT
02600	NECESSARILY   ALL)   OF   THE   BIODYNAMIC  EFFECTS  OF  THE  NATURAL
02700	COUNTERPART, E.G. LOWERING OF BLOOD SUGAR IN THE CASE OF  INSULIN, THEN
02800	THE  ARTIFACTS  ARE  CONSIDERED INDISTINGUISHABLE WITH RESPECT TO THESE
02900	EFFECTS  AND  THUS  SUCCESSFUL   IN   ACHIEVING   THE   SYNTHESIZER'S
03000	PURPOSES. IN  SUCH CASES PHYSICAL  MEASUREMENTS  ARE  USED  TO
03100	DEMONSTRATE DEGREES OF INDISTINGUISHABILITY.  WHEN  HUMAN  JUDGEMENTS
03200	ARE  USED AS MEASURES OF THE INDISTINGUISHABILITY, A NUMBER OF STICKY PROBLEMS
03300	ARISE.
03400	
03500	LET US CONSIDER THE EXAMPLE OF JUDGING WINE  SINCE IT IS A WELL-STUDIED
03600	EXAMPLE OF USING  HUMAN JUDGES WHO MAKE DISCRIMINATIONS USING INTERNAL
03700	MODELS. EXPERT WINE JUDGES HAVE THE TESTED  ABILITY  TO  DISTINGUISH,
03800	FOR  INSTANCE, FRENCH WINE FROM CALIFORNIA WINE, CHARDONNAY FROM PINOT
03900	NOIR. THEY HOLD CONCEPTUAL MODELS ABOUT WHAT THESE WINES  LOOK  LIKE,
04000	SMELL  LIKE AND TASTE LIKE. IN JUDGING A PARTICULAR WINE THEY COMPARE
04100	THEIR SENSORY INPUT ALONG THESE DIMENSIONS WITH  STANDARDS  OF  THEIR
04200	CONCEPTUAL MODEL. WHEN SOME SORT OF MATCH OR MISMATCH IS ATTAINED, AN
04300	IDENTIFICATION OF THE WINE IS MADE. THEN CHECKS ARE MADE  TO  SEE  IF
04400	THE IDENTIFICATION IS CORRECT. AN EXPERT JUDGE IS ONE WHO SUCCEEDS IN
04500	MAKING CORRECT IDENTIFICATIONS IN SOME  HIGH  PERCENTAGE  OF  TRIALS.
04510	
04520	
04600	WHEN  A  COMPUTER SIMULATION OF A PSYCHOLOGICAL PROCESS IS ATTEMPTED,
04700	ONE WAY OF TESTING THE SIMULATION IS TO USE EXPERT JUDGES. (FOR OTHER
04800	WAYS,  SEE  ABELSON  [  ]).  IF  EXPERT JUDGES CANNOT DISTINGUISH THE
04900	SIMULATION FROM ITS NATURAL COUNTERPART, THEN THE SIMULATION  MAY  BE
05000	DEEMED  SUCCESSFUL  AS  MEASURED  BY  THE  INDISTINGUISHABILITY TESTS
05100	UTILIZED. THE IMPORTANT QUESTIONS HERE ARE: WHAT IS AN EXPERT JUDGE?
05200	WHAT TEST DOES HE USE? ARE THERE EXPLANATIONS FOR THE
05300	INDISTINGUISHABILITY OTHER THAN THAT THE SIMULATION PROVIDES A  GOOD
05400	REPRODUCTION OF THE BEHAVIOR UNDER INVESTIGATION?
05500	
05600	IN  JUDGING  WINE  WE CAN DETERMINE WHO IS AN EXPERT BY COMPARING HIS
05700	IDENTIFICATIONS WITH WHAT IS ACTUALLY THE CASE. CAN WE SIMILARLY ESTABLISH BY TESTS
05800	WHO QUALIFIES AS AN EXPERT JUDGE OF CERTAIN HUMAN BEHAVIORS? PROBABLY SO,
05900	BUT IN THE DOMAIN OF PSYCHIATRY AND PSYCHOPATHOLOGY WE ALREADY HAVE CERTIFIED  EXPERTS
06000	SUCH  AS  PSYCHIATRISTS,  SOME OBVIOUSLY  BEING  MORE  EXPERT  THAN OTHERS. ONE
06100	DIFFICULTY HERE IN ESTABLISHING EXPERTISE IS THE RELIABILITY OF  WHAT
06200	IS  BEING  JUDGED. THAT IS, CAN CONSENSUS ABOUT PATHOLOGICAL BEHAVIOR
06300	BE ACHIEVED? WE KNOW THAT MANY OF THE DIAGNOSTIC CATEGORIES USED ARE
06400	UNRELIABLE  IN THE SENSE THAT ONLY LOW LEVELS OF INTERJUDGE AGREEMENT
06500	CAN BE REACHED. ONE EXCEPTION  IS THAT  INVOLVING THE CATEGORY  OF
06600	PARANOIA. WE HAVE SHOWN THAT EVEN WHEN THE DATA BEING JUDGED CONSISTS
06700	OF TRANSCRIPTS OF INITIAL PSYCHIATRIC INTERVIEWS IN WHICH PATIENT AND
06800	PSYCHIATRIST COMMUNICATE BY MEANS OF REMOTELY LOCATED TELETYPES, HIGH
07000	LEVELS  OF  AGREEMENT  (96%)  CAN  BE REACHED AMONG RANDOMLY SELECTED
07100	PSYCHIATRISTS. [ ]  (PERHAPS  PSYCHIATRISTS  SHOULD  MAKE  ALL  THEIR
07200	DIAGNOSES BY TELETYPE?)
07300	
07400	SUPPOSE,  HOWEVER, THE PATIENT BEING INTERVIEWED BY TELETYPE ACTUALLY
07500	CONSISTS  OF  A  COMPUTER  SIMULATION  OF  A  PARANOID  PATIENT.   DO
07600	PSYCHIATRISTS  AGREE  ABOUT  'HIS' PARANOIA? IN SEVERAL TESTS WE HAVE
07700	DEMONSTRATED THAT THEY DO. [ ] BUT IN THESE TESTS THE JUDGES WERE NOT
07800	INFORMED  THAT  ONE  OF  THE  PATIENTS  WAS  A COMPUTER MODEL. SO THE
07900	QUESTION ARISES:
07910	
07920	QUESTION 1: IF INFORMED THAT, OF TWO TELETYPED PSYCHIATRIC INTERVIEWS,
08000	ONE  WAS  WITH A COMPUTER MODEL AND ONE WITH AN ACTUAL PATIENT, CAN A
08100	PSYCHIATRIST TELL WHICH IS WHICH?
08200	
08400	
08500	PSYCHIATRISTS IN THE SAN FRANCISCO BAY AREA CONDUCTED INTERVIEWS OVER
08600	TELETYPES  WITH  HOSPITALIZED  PARANOID  PATIENTS AND WITH A COMPUTER
08700	SIMULATION OF PARANOIA. THE INTERVIEWERS WERE NOT INFORMED  THAT  ONE
08800	OF THE INTERVIEWEES WAS A SIMULATION. PROTOCOLS OF THESE INTERVIEWS, ONE WITH A PARANOID
08900	PATIENT AND ONE WITH OUR PARANOID MODEL, WERE  SENT  TO  100  RANDOMLY
09000	SELECTED  MEMBERS  OF  THE  AMERICAN PSYCHIATRIC ASSOCIATION WHO WERE
09100	ALSO  LISTED  IN  THE  DIRECTORY  OF  MEDICAL  SPECIALISTS  AS  BOARD
09200	CERTIFIED IN PSYCHIATRY.
09300	
09400	THE JUDGES WERE ASKED TO DECIDE WHICH INTERVIEW WAS WITH A PATIENT AND
09500	WHICH WITH THE COMPUTER MODEL. FORTY-ONE  JUDGES  RESPONDED,  WITH  21
09600	(51%)   MAKING   THE  CORRECT  IDENTIFICATION  WHILE  20  (49%)  WERE
09700	INCORRECT. SINCE THE RESULTS WERE NO BETTER THAN CHANCE, THE SIMULATION APPEARS TO BE
09800	INDISTINGUISHABLE FROM NATURALLY OCCURRING PARANOID LINGUISTIC BEHAVIOR.
09900	
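THE CLAIM THAT 21 CORRECT IDENTIFICATIONS OUT OF 41 IS NO BETTER THAN CHANCE
CAN BE CHECKED WITH AN EXACT BINOMIAL CALCULATION. THE FOLLOWING PYTHON SKETCH
IS ONE WAY TO DO SO. IT ASSUMES THAT EACH JUDGE DECIDES INDEPENDENTLY AND THAT
A GUESSING JUDGE IS CORRECT WITH PROBABILITY 0.5. THE FUNCTION NAMES AND THE
0.05 THRESHOLD ARE ILLUSTRATIVE ASSUMPTIONS, NOT PART OF THE ORIGINAL ANALYSIS.

```python
from math import comb


def upper_tail(k, n, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_correct_for_significance(n, alpha=0.05, p=0.5):
    """Smallest k with P(X >= k) <= alpha under pure guessing.

    Returns n + 1 if no attainable count reaches the threshold.
    """
    for k in range(n + 2):
        if upper_tail(k, n, p) <= alpha:
            return k


N_JUDGES = 41   # judges who responded (from the text)
N_CORRECT = 21  # correct identifications (from the text)

# One-sided probability of doing at least this well by guessing alone.
p_value = upper_tail(N_CORRECT, N_JUDGES)

# Assumed 0.05 level: how many correct answers would have been needed?
threshold = min_correct_for_significance(N_JUDGES)

print(f"proportion correct: {N_CORRECT / N_JUDGES:.3f}")
print(f"P(at least {N_CORRECT} correct by guessing): {p_value:.3f}")
print(f"correct answers needed for p <= 0.05: {threshold}")
```

BECAUSE 21 IS THE FIRST COUNT ABOVE THE MIDPOINT OF A SYMMETRIC DISTRIBUTION
OVER 0 TO 41, THE ONE-SIDED PROBABILITY IS 0.5, SO THE OBSERVED RESULT IS
FULLY CONSISTENT WITH GUESSING.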
10100	
10200	THE FACT THAT A SIMULATION IS INDISTINGUISHABLE BY A TEST SUCH AS THE ONE DESCRIBED
10300	DOES NOT NECESSARILY IMPLY THAT THE PARTICULAR  SIMULATION  IS  A
10400	GOOD  ONE.  OTHER  HYPOTHESES  MIGHT  BE  GENERATED  TO  EXPLAIN  THE
10500	INDISTINGUISHABILITY.
10600	
10700	THE QUESTION THEN MIGHT BE ASKED: 
10710	
10720	QUESTION 2: WOULD PSYCHIATRISTS CONFUSE ANY
10800	COMPUTER  PROGRAM  WHOSE  OUTPUT CONSISTS OF CONVERSATIONAL RESPONSES
10900	WITH THE VERBAL BEHAVIOR OF A MENTALLY ILL PATIENT?  
10910	
10920	TO ANSWER THIS QUESTION WE CONSTRUCTED WHAT WE  CONSIDERED  TO  BE  A
11020	POOR   SIMULATION  OF  PARANOID  BEHAVIOR  WHICH  UTILIZED  ONLY  TWO
11120	MECHANISMS: (1) THE OUTPUT RESPONSES WERE RANDOMLY SELECTED FROM  THE
11220	SET OF ALL THE RESPONSES THE ORIGINAL MODEL WAS CAPABLE OF MAKING AND
11320	(2) NO RESPONSE WAS ALLOWED  TO  BE  OUTPUT  MORE  THAN  ONCE  IN  AN
11420	INTERVIEW. WE TERMED THIS THE 'POOR' MODEL.
11600	
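THE TWO MECHANISMS OF THE 'POOR' MODEL CAN BE SKETCHED IN A FEW LINES OF
PYTHON. THIS IS AN ILLUSTRATION, NOT THE PROGRAM ACTUALLY USED: THE CLASS AND
METHOD NAMES, THE PLACEHOLDER RESPONSE POOL, THE SEED, AND THE BEHAVIOR WHEN
THE POOL IS EXHAUSTED ARE ALL ASSUMPTIONS.

```python
import random


class PoorModel:
    """Hypothetical sketch of the 'poor' model's two mechanisms."""

    def __init__(self, responses, seed=None):
        self._rng = random.Random(seed)
        # Mechanism 1: responses are drawn at random from the full pool.
        # Shuffling once and popping is a uniform draw without replacement.
        self._unused = list(responses)
        self._rng.shuffle(self._unused)

    def reply(self, interviewer_input):
        # The interviewer's input is ignored entirely.
        # Mechanism 2: popping guarantees that no response is output twice.
        if not self._unused:
            # Assumption: the source does not say what happens at exhaustion.
            raise RuntimeError("response pool exhausted")
        return self._unused.pop()


# Placeholder pool; the real pool was the original model's response set.
pool = [f"PLACEHOLDER RESPONSE {i}" for i in range(1, 6)]
model = PoorModel(pool, seed=1972)

for question in ["HOW ARE YOU?", "WHY ARE YOU IN THE HOSPITAL?"]:
    print(question, "->", model.reply(question))
```

SINCE THE REPLY NEVER DEPENDS ON THE QUESTION, ANY APPARENT RELEVANCE OF A
RESPONSE IN SUCH AN INTERVIEW IS ACCIDENTAL.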
11700	TWO LOCAL  PSYCHIATRISTS  INTERVIEWED  THIS  POOR MODEL  BY  MEANS  OF
11800	TELETYPE. THE  INTERVIEWS ARE PRESENTED FOR ILLUSTRATION IN
11810	FIGURES 1 AND 2. THE MODEL IS POOR BECAUSE MOST OF THE RESPONSES ARE
11820	IRRELEVANT AND INAPPROPRIATE. IT IS WORTH NOTING THAT AT TIMES THE
11830	RANDOMLY CHOSEN RESPONSE IS QUITE ACCEPTABLE, PARTICULARLY WHEN
11840	A REFERENCE IS MADE TO THE INTERVIEWER. TWO INTERVIEWS, ONE WITH A PARANOID PATIENT
12000	AND  ONE  WITH  THE  POOR  MODEL,  WERE SENT TO ANOTHER 100 RANDOMLY
12100	SELECTED BOARD CERTIFIED MEMBERS OF THE AMERICAN PSYCHIATRIC ASSOCIATION.    (RESULTS
12200	GO HERE)
12300	
12500	
12600	WE WERE SURPRISED BY THE FINDING THAT IN THE POOR MODEL CERTAIN
12700	RESPONSES ARE QUITE APPROPRIATE TO THE QUESTIONS ASKED.  IT  MUST  BE
12800	REMEMBERED  THAT THE RANDOM SELECTIONS WERE MADE FROM A FINITE SET OF
12900	A FEW HUNDRED STATEMENTS TYPICAL OF PARANOID PATIENTS. THE SELECTION WAS NOT FROM A LARGE SET OF
13000	RANDOMLY CHOSEN PATIENT AND NONPATIENT  STATEMENTS. SELECTION FROM SUCH A SET MIGHT BE EXPECTED TO YIELD
13100	EVEN GREATER DEGREES OF DISCONTINUITY AND INAPPROPRIATENESS IN THE INTERVIEW.
13500	
13600	IF JUDGES WERE UNABLE TO DISTINGUISH THE POOR MODEL FROM THE HUMAN PATIENT,
13700	ONE INTERPRETATION WOULD BE THAT THEY PAY LITTLE ATTENTION TO SEQUENTIAL BEHAVIOR.
13800	ALTERNATIVELY, SUCH A FAILURE MIGHT IMPLY THAT THE TEST
13900	IS WEAK, THAT THE JUDGES ARE BAD, THAT THE POOR MODEL IS NOT POOR
14000	ENOUGH, THAT WE HAD BAD LUCK IN GENERATING THE POOR-MODEL INTERVIEWS, OR ANY
14100	COMBINATION OF THESE FACTORS.
14600	
14700	IT MIGHT BE SAID THAT WHILE PSYCHIATRISTS ARE EXPERT JUDGES IN
14800	FACE-TO-FACE SITUATIONS WITH PATIENTS, THEY ARE NOT AS EXPERT IN
14900	MAKING JUDGEMENTS ON TELETYPED DATA WHICH IS SPARSE AND LACKING IN
15000	NONVERBAL INFORMATION. ALSO THEY WOULD NOT BE EXPECTED TO BE ABLE
15100	TO IDENTIFY A COMPUTER PROGRAM SINCE IT IS AN UNFAMILIAR OBJECT
15200	IN THEIR EXPERIENCE. IF WE ASK WHETHER THERE ARE OTHER EXPERT JUDGES, COMPUTER SCIENTISTS COME TO MIND.
15210	
15300	QUESTION 3: WOULD COMPUTER EXPERTS BE ABLE TO DISTINGUISH THE POOR MODEL
15400	FROM A HUMAN PATIENT?
15500	
15800	THE TWO INTERVIEWS WHICH WERE SENT TO PSYCHIATRISTS, ONE WITH THE
15900	POOR MODEL AND ONE WITH A PATIENT, WERE ALSO SENT TO 100 RANDOMLY
16000	SELECTED MEMBERS OF THE ASSOCIATION FOR COMPUTING MACHINERY (ACM).
16100	(RESULTS HERE)
20100	
20200	
22100	                      CONCLUSION
22200	
22300	THE DATA FROM THESE TESTS INDICATE THAT RELEVANT EXPERT JUDGES,
22400	PSYCHIATRISTS AND COMPUTER SCIENTISTS, DO NOT DISTINGUISH A 'GOOD'
22500	SIMULATION OF PARANOID LINGUISTIC BEHAVIOR FROM THAT OF ACTUAL PARANOID
22600	PATIENTS. PSYCHIATRISTS WERE ABLE TO DISTINGUISH PARANOID PATIENTS FROM
22700	A 'POOR' SIMULATION. THE TERM 'GOOD' HERE MEANS ONLY THAT THE MODEL PASSED
22800	CERTAIN TESTS AND HENCE IS 'NOT BAD'. THE MODEL HAS SEVERAL OBVIOUS SHORTCOMINGS,
22900	PARTICULARLY IN ITS DEFICIENT UNDERSTANDING OF ENGLISH. AT THIS POINT WE
23000	FEEL THE MODEL REPRESENTS A GOOD START IN THE RIGHT DIRECTION. ONLY AFTER
23100	A GREAT DEAL MORE WORK IN IMPROVING THE MODEL WILL WE BE ABLE TO SAY, NOT
23200	THAT IT IS PERFECT, BUT THAT IT IS THE BEST AVAILABLE EXPLANATION OF
23300	PARANOID THOUGHT 'FOR THE TIME BEING'.